Abstract

The contact structure between hosts has a critical influence on disease spread. However, most network-based
models used in epidemiology tend to ignore heterogeneity in the weighting of contacts. This assumption
is known to be at odds with the data for many contact networks (e.g. sexual contact networks)
and to have a strong effect on the predictions of epidemiological models. One of the reasons why models
usually ignore heterogeneity in transmission is that we currently lack tools to analyze weighted networks,
such that most studies rely on numerical simulations. Here, we present a novel framework to estimate key
epidemiological variables, such as the rate of early epidemic expansion ($r_0$) and the basic reproductive ratio
($R_0$), from joint probability distributions of number of partners (contacts) and number of interaction events
through which contacts are weighted. This framework also allows for a derivation of the full time course of
epidemic prevalence and contact behaviour, which is validated using numerical simulations. Our framework
allows for the incorporation of more realistic contact networks into epidemiological models, thus improving
predictions on the spread of emerging infectious diseases.

1 Introduction 

Contact structure between hosts is known to have a key influence on disease spread [1]. A striking result is
for instance that the more heterogeneous the contact network is, i.e. the higher the variance in the number of 
contacts per individual, the more rapid the initial disease spread. 

One way to capture contact structure is to use a network [2]. Such contact networks are typically described
by a square binary adjacency matrix, where the entry in the ith row and jth column takes the value 0 or 1 to
indicate respectively the absence or the presence of a contact between individuals i and j. The sum of each row
(or column) gives the total number of contacts of an individual (i.e. the node's degree in the network).
Contact networks are widely used because they have several valuable properties, one of which is that the
dominant eigenvalue of the adjacency matrix is an indicator of the early propagation of an infectious disease
spreading on this network [3, 4, 5].
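The two quantities mentioned above can be illustrated on a toy network. The following is a minimal sketch, assuming a small hypothetical undirected network of four individuals; the function names are illustrative and the eigenvalue is approximated by a simple power iteration rather than by a library routine.

```python
# Toy undirected contact network on 4 individuals (hypothetical example).
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

def degrees(adj):
    """Row sums of the binary adjacency matrix, i.e. each node's degree."""
    return [sum(row) for row in adj]

def dominant_eigenvalue(adj, iterations=500):
    """Approximate the largest eigenvalue of a symmetric adjacency matrix.

    Power iteration is run on (A + I) so that it also converges on
    bipartite networks; the eigenvalue of A is then read off as the
    Rayleigh quotient of the normalised vector.
    """
    n = len(adj)
    v = [1.0] * n
    for _ in range(iterations):
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Av = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Av[i] for i in range(n))

k = degrees(A)
lam = dominant_eigenvalue(A)
```

For a connected, non-regular network such as this one, the dominant eigenvalue lies strictly between the mean degree and the maximum degree.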

Insights into epidemic thresholds can be gained through the distribution of the number of contacts (degrees).
The number of secondary infections generated by a typical infected host in a fully susceptible population, i.e.
the basic reproductive number $R_0$, scales with the ratio of the second moment $\langle k^2 \rangle$ and the first moment (mean)
$\langle k \rangle$ of the distribution of the number of contacts $k$. This result holds both for static networks (denoted $R_0^{stat}$) [6]
and for fully mixed, dynamic networks (denoted $R_0^{mix}$) [7, 8]. The static case corresponds to networks in
which the identity of contacts is fixed (as approximately seen in sexual contact networks) and the fully mixed
dynamic case corresponds to a situation in which individuals update their contacts dynamically in a fully mixed
fashion within the population (as approximately seen in airborne infections on small geographical scales).
The number of contacts follows the distribution of $k$ for both network types. Regarding disease spread, it has
been shown that:



where $\sigma^2 = \langle k^2 \rangle - \langle k \rangle^2$ is the variance of the distribution of the number of contacts.

$R_0^{stat}$ and $R_0^{mix}$ represent the lower and upper bounds of the basic reproductive ratio [9] for SIR epidemics
on random networks if individuals transmit the infection at a rate $\beta$ and recover from the infection at a rate
$\gamma$. Findings for both static and dynamic networks imply that heterogeneous networks with a large or even
diverging variance in the distribution of the number of contacts have a very small or even vanishing epidemic
threshold.
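For orientation, the two bounds are commonly stated in the literature on SIR epidemics on random networks in the following form. This is a sketch in the notation of this section, assuming constant transmission and recovery rates; it may differ in layout from the expression the text refers to.

```latex
\begin{align}
R_0^{stat} &= \frac{\beta}{\beta + \gamma} \, \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}
            = \frac{\beta}{\beta + \gamma} \left( \langle k \rangle + \frac{\sigma^2}{\langle k \rangle} - 1 \right), \\
R_0^{mix}  &= \frac{\beta}{\gamma} \, \frac{\langle k^2 \rangle}{\langle k \rangle}
            = \frac{\beta}{\gamma} \left( \langle k \rangle + \frac{\sigma^2}{\langle k \rangle} \right).
\end{align}
```

In both cases the variance $\sigma^2$ enters additively, so a diverging variance drives $R_0$ above one for any positive $\beta$.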

One of the key assumptions that epidemiological models on networks typically make in order to derive such
elegant expressions for $R_0$ is that the transmission rate is the same between all pairs of individuals. This is
reflected in the fact that all the edges of the contact matrix have the same weight (0 or 1). In reality,
however, this is known to be a simplifying assumption [10]. To take just one example related to infectious
diseases, in the case of sexual contact networks, the number of sex acts per unit of time is not constant across
all partnerships [11, 12, 13]. More generally, the number of interaction events (which correspond to potential
transmission events) may vary among contact pairs. Simplifying reality is commendable as long as it does
not modify the conclusions of the model. The problem here is that tampering with the weighting of the network
has been shown to lead to the loss of important epidemiological properties of heterogeneous networks, such
as the low value of the epidemiological threshold or the negative correlation between network size and the
epidemiological threshold value [14].

An increasing number of studies in epidemiology point to the importance of considering weighted networks.
Some examples include the spread of sexually-transmitted infections [13], disease transmission in sheep flocks
[15], respiratory diseases of humans [16] or general infectious diseases of humans spreading on a social contact
network [17] or on airline connection networks [18]. Several more conceptual studies have also been published
in the theoretical physics literature (e.g. [12, 19, 20]). Most of these studies have in common that they use
weighted networks and resort to (heavy) numerical simulations. Indeed, contrary to unweighted networks, we
lack analytical frameworks to analyze epidemic spread on weighted networks.

Amongst earlier epidemic models on weighted networks, one approach stands out [21]. The authors describe
the early epidemic expansion in a Reed-Frost model where infections take place in discrete time steps,
with non-overlapping generations, and each infected individual recovers with certainty one time step after infection.
These simplifications allow them to obtain an analytical expression of $R_0$ and to assess outbreak probabilities
using branching processes. In their formalism, as shown in [22], $R_0$ can be derived as the dominant
eigenvalue of the mean offspring matrix $(m_{dk})$, where $m_{dk}$ represents the expected number of individuals
with $k$ contacts that an individual with $d$ contacts infects, considering potentially degree-dependent network
weights. Importantly, it is only because the authors make strong simplifying assumptions in their model, such
as the independence between network weights and nodes' degrees, that they can derive an explicit form of $R_0$.

Here, we present an original framework, which builds on configuration type network epidemic approaches
[23, 24] to model the dynamics of a disease spreading on a weighted network and to estimate key epidemiological
variables from the network's properties. Our framework does not require strong simplifying assumptions
regarding the epidemiological process or the distribution of weights on the contact network to provide explicit
expressions for $R_0$ or epidemic prevalence. It goes beyond earlier frameworks describing continuous time SIR
epidemics on weighted networks by providing explicit expressions for the rate of early epidemic expansion ($r_0$)
and the basic reproductive ratio ($R_0$) of the infection. It also allows for a derivation of the full time course of
epidemic prevalence and contact behaviour of susceptible, infected and recovered individuals (in terms of the
probability generating functions (PGFs) of the degree distributions). As sketched in Fig. 1, the parametrisation
is done in a general fashion using the joint probability distribution $P_{kl}$ of an individual to have $k$ contacts
among which (s)he randomly distributes $l$ interaction events. We validate our analytical results using numerical
simulations.

Figure 1: Not only is the number of contacts that an individual maintains relevant for the spread of an infectious
agent but also the weight that (s)he assigns to each contact. Here, each individual has $l$ interaction events that
(s)he can distribute among his/her $k$ contacts. On the scale of the transmission network, these are modelled
by the joint probability distribution $P_{kl}$ to find an individual with $k$ contacts and $l$ interaction events per time
interval.

2 The Model 

A susceptible individual has $k$ contacts and $l$ interaction events per time interval (which are distributed among
his/her $k$ contacts). This individual can get infected at a rate proportional to the number of interaction events ($l$),
the transmission rate of the pathogen ($\beta$) and the probability for each of his/her contacts to point to an infected
individual ($p_{SI}$). The population dynamics of susceptible, infected and recovered individuals with $k$ contacts
and $l$ interaction events per time interval are thus captured by the following set of differential equations:

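\begin{align}
\dot{S}_{kl} &= -\beta \, p_{SI} \, l \, S_{kl} \tag{2a} \\
\dot{I}_{kl} &= \beta \, p_{SI} \, l \, S_{kl} - \gamma \, I_{kl} \tag{2b} \\
\dot{R}_{kl} &= \gamma \, I_{kl} \tag{2c}
\end{align}

where $\gamma$ is the rate at which recovery from infection occurs (see Table 1).

The dynamics of the total number of susceptible, infected and recovered individuals can be obtained by
summing over $k$ and $l$. This leads to:

\begin{align}
\dot{S} &= -\beta \, p_{SI} \, \langle l \rangle_S \, S \tag{3a} \\
\dot{I} &= \beta \, p_{SI} \, \langle l \rangle_S \, S - \gamma \, I \tag{3b} \\
\dot{R} &= \gamma \, I \tag{3c}
\end{align}

where $\langle l \rangle_S = \sum_{k,l} l \, S_{kl} / S = G_S^{(0,1)}(1, 1, t)$ is the average number of interaction events per time a susceptible
individual has.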



Table 1: Notations used in the study.

$\dot{f}(x,t) = \frac{\partial}{\partial t} f(x,t)$: partial derivative of function $f$ with respect to $t$
$f^{(a,b)}(x,y,t) = \frac{\partial^{a+b}}{\partial x^a \partial y^b} f(x,y,t)$: partial derivative of function $f$ with respect to $x$ and $y$
$A_{kl}$: number of individuals in group A with $k$ contacts and $l$ (potential) transmission events (per time interval)
$A = \sum_{k,l} A_{kl}$: number of individuals in group A
$N_{kl} = \sum_A A_{kl}$: number of individuals with $k$ contacts and $l$ transmission events (per time interval)
$N = \sum_{k,l} N_{kl}$: total number of individuals
$P_{Akl} = A_{kl} / A$: probability for an individual in group A to have $k$ contacts and $l$ transmission events per time interval
$G_A(x,y,t) = \sum_{k,l} P_{Akl}(t) \, x^k y^l$: probability generating function (PGF) of $P_{Akl}(t)$
$\langle k \rangle_A = G_A^{(1,0)}(1,1,t)$: average number of contacts of A individuals
$\langle l \rangle_A = G_A^{(0,1)}(1,1,t)$: average number of transmission events per time interval of A individuals
$\langle kl \rangle_A = G_A^{(1,1)}(1,1,t)$: average number of contacts times transmission events per time interval of A individuals
$P_{kl} = N_{kl} / N$: probability for an individual to have $k$ contacts and $l$ (potential) transmission events per time interval
$G(x,y,t) = \sum_{k,l} P_{kl}(t) \, x^k y^l = \sum_A \frac{A}{N} G_A(x,y,t)$: probability generating function (PGF) of $P_{kl}(t)$
$\langle k \rangle = G^{(1,0)}(1,1,t)$: average number of contacts of individuals
$M_A = \sum_{k,l} k \, A_{kl} = A \, G_A^{(1,0)}(1,1,t)$: number of links coming from A individuals
$M = \sum_A M_A$: number of links
$M_{AB}$: number of links coming from A individuals and pointing to B individuals
$p_{AB} = M_{AB} / M_A$: probability for a link starting from an A individual to point to a B individual

A, B correspond to epidemic stages, i.e. S, I, R for susceptible, infected, recovered.



To close the equation system (3) we need expressions for the temporal dynamics of the $p_{AB}$, i.e. the probabilities
for a status A individual's contact to be with an individual of status B. These can be derived through
a careful assessment of the links/contacts among susceptible and infected individuals over the course of an
epidemic. This means following the dynamics of the joint probability distribution to find $k$ contacts and $l$ interaction
events per time among susceptible individuals, $P_{Skl} = S_{kl}/S$, through its PGF $G_S(x, y, t)$. The temporal
dynamics of the probability generating function $G_S(x, y, t)$ can be calculated by observing that the dynamics
of the corresponding joint probability distribution of contacts and interaction events of susceptible individuals
are governed by the equation $\dot{P}_{Skl} = \frac{d}{dt} \left( S_{kl}/S \right)$. As shown in detail in the Supporting Information, we close
equation system (3) with the following equations:

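\begin{align}
\dot{p}_{SI} &= p_{SI} \left[ \beta \, p_{SS} \frac{\langle kl \rangle_S - 2 \langle l \rangle_S}{\langle k \rangle_S} - \beta \left( 1 - p_{SI} - p_{SS} \right) \frac{\langle l \rangle_S}{\langle k \rangle_S} - \gamma \right] \tag{4a} \\
\dot{p}_{SS} &= -\beta \, p_{SI} \, p_{SS} \frac{\langle kl \rangle_S - 2 \langle l \rangle_S}{\langle k \rangle_S} \tag{4b} \\
\dot{G}_S(x,y,t) &= \beta \, p_{SI} \left( \langle l \rangle_S \, G_S(x,y,t) - y \, G_S^{(0,1)}(x,y,t) \right) \tag{4c}
\end{align}

For further details about the terms in this equation system, see Table 1.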
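The closed system lends itself to direct numerical integration. The following is a minimal sketch, not the authors' implementation: it assumes a forward-Euler scheme, tracks the distribution $P_{Skl}$ itself instead of its PGF (using $\dot{P}_{Skl} = \beta p_{SI} (\langle l \rangle_S - l) P_{Skl}$, which is the PGF equation read coefficient by coefficient), and assumes that the initial cases are placed at random so that $p_{SI}(0) = I(0)/N$. The function names and the small four-point distribution are hypothetical.

```python
def moments(P):
    """Return <k>, <l>, <kl> of a joint distribution given as {(k, l): prob}."""
    mk = sum(k * p for (k, l), p in P.items())
    ml = sum(l * p for (k, l), p in P.items())
    mkl = sum(k * l * p for (k, l), p in P.items())
    return mk, ml, mkl

def simulate(P0, beta, gamma, N, I0, t_end, dt=0.1):
    """Forward-Euler integration of S, I, R, p_SI, p_SS and P_Skl."""
    S, I, R = float(N - I0), float(I0), 0.0
    p_SI = I0 / N              # assumption: index cases are chosen at random
    p_SS = 1.0 - p_SI
    P = dict(P0)               # P_Skl, initially equal to P_kl
    traj = [(0.0, S, I, R)]
    for n in range(1, int(round(t_end / dt)) + 1):
        mk, ml, mkl = moments(P)
        force = beta * p_SI
        dS = -force * ml * S
        dI = force * ml * S - gamma * I
        dR = gamma * I
        d_pSI = p_SI * (beta * p_SS * (mkl - 2 * ml) / mk
                        - beta * (1 - p_SI - p_SS) * ml / mk - gamma)
        d_pSS = -force * p_SS * (mkl - 2 * ml) / mk
        # Susceptibles with many interaction events are depleted first.
        P = {kl: p + dt * force * (ml - kl[1]) * p for kl, p in P.items()}
        S, I, R = S + dt * dS, I + dt * dI, R + dt * dR
        p_SI, p_SS = p_SI + dt * d_pSI, p_SS + dt * d_pSS
        traj.append((n * dt, S, I, R))
    return traj

# Hypothetical example: k in {2, 6} and l in {4, 12}, independent, <k>=4, <l>=8.
P0 = {(2, 4): 0.25, (2, 12): 0.25, (6, 4): 0.25, (6, 12): 0.25}
traj = simulate(P0, beta=0.01, gamma=0.004, N=10000, I0=20, t_end=400.0)
```

A higher-order integrator would be preferable for production use; Euler is chosen here only to keep the correspondence with the equations visible.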



3 Validating the analytical model 

The joint probability distribution of the number of partners and number of interaction events ($P_{kl}$) can be written
as the product $P_{kl} = P_k P_{l|k}$, $P_{l|k}$ being the probability distribution of the number of interaction events per time
$l$ given that the individual has $k$ contacts. If $P_{l|k} = \delta_{lk}$, where $\delta_{lk}$ is 1 if $l = k$ and 0 otherwise, we are back
to a 'classical' network case, with an exact linear dependency between the number of contacts and the number
of interaction events (for a detailed discussion, see the Supporting Information). Our framework allows us to
model more general situations by explicitly choosing $P_{l|k}$.
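The factorisation into a degree distribution and a conditional weight distribution can be sketched in a few lines. The example below is illustrative only: the distributions `Pk` and `Pl` and the helper names are hypothetical, and joint distributions are stored as dictionaries keyed by $(k, l)$.

```python
import math

def joint_from_conditional(Pk, cond):
    """Build P_kl = P_k * P_{l|k}; cond(k) returns {l: P_{l|k}}."""
    return {(k, l): pk * plk for k, pk in Pk.items() for l, plk in cond(k).items()}

def delta_conditional(k):
    """P_{l|k} = delta_{lk}: the 'classical' case with l = k."""
    return {k: 1.0}

def mean_kl(P):
    """Mixed moment <kl> of a joint distribution {(k, l): prob}."""
    return sum(k * l * p for (k, l), p in P.items())

Pk = {1: 0.5, 2: 0.3, 5: 0.2}       # hypothetical degree distribution, <k> = 2.1
Pl = {2: 0.5, 4: 0.3, 10: 0.2}      # hypothetical weights, <l> = 4.2 = 2 <k>

classical = joint_from_conditional(Pk, delta_conditional)
independent = joint_from_conditional(Pk, lambda k: Pl)
```

In the classical case $\langle kl \rangle$ reduces to $\langle k^2 \rangle$, whereas for independent weights it factorises into $\langle k \rangle \langle l \rangle$.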

In order to test our analytical model, we considered epidemiological dynamics taking place on artificial networks
on which we relax the constraint of linearity found in 'classical' networks. To create these networks,
we used combinations of Poisson and power law distributions for the number of contacts $k$ and interaction
events per time $l$. This allows us to introduce arbitrary combinations of homogeneous or heterogeneous behaviour
in the way contacts are made and in the number of interaction events established, which may be either
independent or dependent (as in the linear case).

We generated four types of networks corresponding to the combinations of homogeneous and heterogeneous
behaviour in the number of contacts $k$ and the number of interaction events $l$, as well as the corresponding
networks with a linear dependency between $k$ and $l$. We studied the spread of an infectious agent on these
artificial networks using the analytical approach by plugging the corresponding joint probability generating
functions into equations (3) and (4) (further details concerning the PGFs used for each network are given in the
Materials and Methods section). In addition, we ran numerical simulations. For these simulations, networks were
generated by assigning each host a number $k$ of 'half-contacts' (stubs) and $l$ interaction events per time interval
drawn from the distribution $P_{kl}$. Each host then distributed his/her interaction events equi-probably at random
(i.e. multinomially) among his/her $k$ contacts. $P_{kl}$ was chosen such that $\langle l \rangle = 2 \langle k \rangle$, corresponding to a timescale
on which a host has on average 2 interaction events per contact.

The next step in the generation of a configuration type network consisted in randomly matching half-contacts
(stubs) while respecting their assigned number of interaction events per contact. This may introduce topological
constraints on the network that can lead to network segregation and assortative effects that are not seen
in the analytical approach per se due to its node-centric view (see Supporting Information).
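The generation steps described above can be sketched as follows. This is a simplified illustration rather than the procedure used for the figures: it assumes a greedy matching in which a stub is paired with the first waiting stub from another node whose weight differs by at most `tol` interaction events, stores the mean of the two stub weights on the edge, and does not exclude repeated contacts between the same pair. All function names and the tolerance value are hypothetical.

```python
import random

def draw_nodes(P, n, rng):
    """Draw n (k, l) pairs from a joint distribution {(k, l): prob}."""
    pairs = list(P)
    return rng.choices(pairs, weights=[P[kl] for kl in pairs], k=n)

def stub_weights(k, l, rng):
    """Distribute l interaction events multinomially over k stubs."""
    w = [0] * k
    for _ in range(l):
        w[rng.randrange(k)] += 1
    return w

def match_stubs(nodes, rng, tol=1):
    """Randomly pair stubs whose weights differ by at most tol."""
    stubs = [(i, w) for i, (k, l) in enumerate(nodes)
             for w in stub_weights(k, l, rng)]
    rng.shuffle(stubs)
    edges, unmatched = [], []
    for node, weight in stubs:
        for j, (other, other_weight) in enumerate(unmatched):
            if other != node and abs(other_weight - weight) <= tol:
                edges.append((node, other, (weight + other_weight) / 2))
                del unmatched[j]
                break
        else:
            unmatched.append((node, weight))  # wait for a compatible partner
    return edges, unmatched

rng = random.Random(1)
P = {(2, 4): 0.25, (2, 12): 0.25, (6, 4): 0.25, (6, 12): 0.25}
nodes = draw_nodes(P, 200, rng)
edges, unmatched = match_stubs(nodes, rng)
```

The stubs left in `unmatched` are those for which no partner of similar weight was available, which is one way the weight constraint shapes the resulting topology.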
If the number of interaction events per contact is large ($\langle l \rangle / \langle k \rangle \gg 1$), hosts nearly equi-distribute their
interaction events among their contacts due to convergence under the law of large numbers. Exact matching of
the host's number of interaction events per contact and time segregates the network into subnetworks according
to this number (weight assortativity). A more realistic network is obtained by decreasing $\langle l \rangle$ and thereby
introducing some variability in the number of interaction events among a host's contacts, which simulates some
flexibility in assigning interaction events at short time scales. In addition, we also introduced some tolerance
in 'negotiating' the number of interaction events per contact, which leads to a slightly modified empirical
joint probability distribution (denoted $\tilde{P}_{kl}$ with PGF $\tilde{G}(x, y)$), which is discussed in detail in the Materials and
Methods. Weight assortativity in terms of $l/k$ leads to a network segregation that constrains epidemics, but it
might at the same time speed up the initial expansion of an epidemic as compared to the prediction of the analytical
approach. This situation arises in networks with a (nearly) constant number of contacts $k$ per individual among
which a heterogeneous number of interaction events $l$ is distributed, leading to an early expansion among mostly
highly active individuals. Alternatively, the network can also segregate with respect to the number of contacts
an individual holds (contact or degree assortativity). An extreme case can be observed when a (nearly) constant
number of interaction events has to be distributed among a heterogeneous number of contacts. This leads to
a (near) isolation of individuals with single contacts from the epidemic process (for details cf. Supporting
Information).

Figure 2: Dynamics of the number of infected hosts ($I$) during epidemic spreading on different types of networks.
The distributions in the number of contacts ($k$) and interaction events per time ($l$) are either homogeneous
(Poisson) or heterogeneous (power law). For the number of interaction events, we also show the linear case in
which $l$ is strictly proportional to the number of contacts $k$. $k$ and $l$ are drawn from joint distributions $P_{kl}$ with
$\langle l \rangle = 2 \langle k \rangle$ (except for the linear case of the analytical $P_{kl}$ model, where $l = k$ is compensated by a doubled transmission
rate). The figures show the outcome of the simulation runs (grey, dotted lines) and of the numerical solution
of the analytical model with $P_{kl}$ (red, solid line) and $\tilde{P}_{kl}$ (red, dashed line). In addition, we show the epidemic
prevalence when excluding individuals with only one contact ($k = 1$), which is relevant for epidemics on networks
with heterogeneous numbers of contacts including many individuals with $k = 1$ in combination with a (nearly)
constant number of interaction events, as realised through a Poisson distribution (orange line, cf. specifically
$P_k$ power law, $P_l$ Poisson). Parameters chosen correspond to $\langle k \rangle = 4$ (Poisson case: $\langle k \rangle = 4$, $\langle l \rangle = 8$; power law case:
$\lambda_k = 1.4$, $\lambda_l = 0.89$, $\kappa_k = \kappa_l = 22$). Epidemiological parameters are $\beta = 0.01$ (0.02 for the linear case of the analytical $P_{kl}$ model),
$\gamma = 0.004$ in arbitrary units and $I(0) = 20$. Early epidemic expansion is further captured by exponential growth with
$I(0) e^{r_0 t}$ (black, dashed line).

To validate the model, we compared the epidemic prevalence ($I$) from repeated simulation runs with the
results derived from the analytical approach using the probability generating functions corresponding to $P_{kl}$
and $\tilde{P}_{kl}$, $G(x, y)$ and $\tilde{G}(x, y)$ (note that $P_{kl}(0) = P_{Skl}(0)$, $\tilde{P}_{kl}(0) = \tilde{P}_{Skl}(0)$, $G(x, y, 0) = G_S(x, y, 0)$ and
$\tilde{G}(x, y, 0) = \tilde{G}_S(x, y, 0)$). The epidemiological dynamics are summarized in Fig. 2 through a comparison of
the simulation results with solutions of the analytical approach for $P_{kl}$ ($G(x, y)$), $\tilde{P}_{kl}$ ($\tilde{G}(x, y)$) and an approximation
(applied to $\tilde{P}_{kl}$) excluding individuals with one contact, as relevant for networks with a heterogeneous
number of contacts and a (nearly) constant number of interaction events per individual (contact assortativity).
The analytical approach represents the simulation results well if the constraints of the specific network topology
are properly taken into account. A particularly interesting observation is that epidemics in networks with
heterogeneous contact behaviour slow down in independently weighted networks ($P_{kl} = P_k P_l$) as compared to the
case of 'classical, linear' networks in which the number of interaction events scales with the number of contacts
for each individual.

4 Capturing epidemic characteristics ($r_0$ and $R_0$)

There are different ways to assess the initial propagation of an infectious agent in an otherwise fully susceptible
population. One possibility is to estimate the initial exponential growth rate $r_0$ in the number of infected individuals.
Another possibility consists in estimating the number of secondary cases created by a newly infected
host in a fully susceptible population, which is classically referred to as the basic reproductive ratio $R_0$ [1].

Table 2: $r_0$ vs. $R_0$. Columns: early epidemic growth rate $r_0$; basic reproductive ratio $R_0$. Rows: epidemic expansion from a randomly picked index case (mean field approximation), $M_{SI} \approx I \langle k \rangle$; early epidemic expansion with structure set by the epidemic.

Furthermore, subtle effects arise depending on whether we take the neighbourhood of a random individual
as a reference or the neighbourhood of a 'typically' infected individual. The first case (using a random individual
as reference) corresponds to what is usually referred to as a 'mean field approximation' and describes
the very first infection events. The second case (using a 'typical' infected individual) is more appropriate to
capture the next stages of early epidemic expansion because it accounts for the fact that spatial structure has
been sensed or set by the epidemic process and is considered for the derivation of $R_0$.

The expressions for $r_0$ and $R_0$ for the SIR model are shown in Tab. 2 and are derived in detail in the
sections on Materials and Methods and the Supporting Information, respectively. Note that these do not involve
any approximation beyond those implied in the model's assumptions, i.e. they are exact within the model
framework. The derivation for $R_0$ in the case where $\gamma > 0$ is the only one that required some further approximations
to obtain an explicit formula (cf. Supporting Information). Fig. 2 shows that the exponential growth
rate $r_0 = \left( \frac{\langle kl \rangle}{\langle l \rangle} - 2 \right) \frac{\langle l \rangle}{\langle k \rangle} \beta - \gamma$ calculated for the early epidemic expansion corresponds well to the simulation
and analytical results. The expressions for $r_0$ and $R_0$ scale with the second moment $\langle kl \rangle$ of the joint probability
distribution $P_{kl}$, quantifying how correlations between the number of contacts $k$ and interaction events $l$ an
individual maintains affect epidemic spreading.
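As a worked illustration of the growth rate formula, the sketch below evaluates $r_0 = (\langle kl \rangle / \langle l \rangle - 2)(\langle l \rangle / \langle k \rangle) \beta - \gamma$ for two settings loosely modelled on the Poisson parameters quoted for Fig. 2. The moments are assumptions made for the example: for independent weights $\langle kl \rangle = \langle k \rangle \langle l \rangle$, and for the linear case an untruncated Poisson distribution is assumed, so that $\langle k^2 \rangle = \langle k \rangle^2 + \langle k \rangle$. The function name is hypothetical.

```python
import math

def growth_rate(mean_k, mean_l, mean_kl, beta, gamma):
    """Early exponential growth rate r_0 from the moments of P_kl."""
    return (mean_kl / mean_l - 2.0) * (mean_l / mean_k) * beta - gamma

# Independent weights: <k> = 4, <l> = 8, hence <kl> = <k><l> = 32.
r_independent = growth_rate(4, 8, 32, beta=0.01, gamma=0.004)

# Linear case l = k with doubled transmission rate: <kl> = <k^2> = 20
# (assumption: untruncated Poisson with mean 4).
r_linear = growth_rate(4, 4, 20, beta=0.02, gamma=0.004)
```

Under these assumptions the linear network yields the larger growth rate, which is in line with the slower spread reported for independently weighted networks.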

5 Real world scenarios - application to epidemiological data 

The knowledge of transmission networks along which an infectious agent can spread within a host population
is of great importance to public health. These networks might be hard to define for air-borne infections because
they are very dynamic [25, 26] but easier to define for sexually transmitted infections (STI) because they are
rather static. Such sexual contact networks have been surveyed in many studies covering homosexual as well
as heterosexual populations and different societal contexts [27, 28, 29, 30] to understand and confine the spread
of STI. The National Survey of Sexual Attitudes and Lifestyles (NATSAL [28]) provides very detailed data
on the situation in the United Kingdom, including distributions in the number of sexual partners $k$ and sex acts
(interaction events $l$) a person has within certain time frames. As shown in Fig. 3, both the number of partners
(contacts $k$) and sex acts (interaction events $l$) an individual has are heterogeneously distributed. However, their
joint distribution $P_{kl}$ does not show a linear behaviour, meaning that the number of sex acts $l$ does not scale
linearly with the number of partners $k$ an individual has. Still, as pointed out in [14], many epidemic modelling
studies on unweighted networks rely on this assumption. The linearity assumption in combination with the
broad distribution found in the number of sexual contacts and sex acts results in predictions of very fast early
epidemic expansion and an epidemic threshold that is potentially vanishing in the limit of infinite network size,
as can be seen from equation (1).

In our 'Validation' section we have shown that, regarding the number of interaction events, a deviation
from the linear behaviour results in a decrease in epidemic expansion and peak prevalence, especially for
transmission networks that are characterized by a heterogeneous distribution in the number of contacts per
individual $k$ and regardless of whether the distribution of interaction events/sex acts is homogeneous (Poisson)
or heterogeneous (power law). This behaviour is also reflected in Fig. 3, which shows the epidemic expansion
of a susceptible-infected epidemic with transmission rate $\beta = 0.01$ per sex act in two network scenarios. The
first scenario (red curve) builds on a linear contact network only defined by the contact distribution $P_k$, in which
the number of sex acts grows linearly with the number of contacts, respecting the average number of sex acts
($\langle l \rangle = 5.59$ during 4 weeks). The second scenario (blue curve) takes the joint probability distribution $P_{kl}$ into
account and evaluates the epidemic on the resulting weighted network. Just from the exponential
growth rates of the epidemics, which are $r_0 = 1.111$ per year and $r_0 = -0.1625$ per year, respectively, one
can conclude that the former scenario supports epidemic expansion, while the latter does not beyond a few
infections caused by the initially randomly infected individuals ($r_0^{MF} = \beta \langle l \rangle = 0.1679$ per year in the mean
field approximation, grey dashed line). Although the survey data shown in Fig. 3 only provide us with a
rough picture of the real transmission network and although the number of partners during one year can only
approximately be considered as being concurrent, the data are sufficient to confirm a remarkable reduction
in the epidemic potential when shifting from a classical unweighted transmission network towards a more
realistic weighted transmission network. This finding is in particular consistent with an earlier simulation study
on epidemic spreading along a network of homosexual contacts [13].

Figure 3: A) Characteristics of the sexual contact network inferred from the NATSAL contact tracing study
and B) dynamics of SI epidemics spreading on sexual contact networks. The network of heterosexual contacts
shows a heterogeneous joint probability distribution $P_{kl}$ for an individual to have $k$ contacts and $l$ sex acts
(interaction events) as derived from NATSAL data [28], which holds analogously for the marginal distributions $P_k$
and $P_l$ (panel A). Panel B shows SI epidemics with a transmission rate per sex act of $\beta = 0.01$ along a weighted
network of the sexual contacts as defined by $P_{kl}$ (blue; the dashed line corresponds to the early exponential approximation
$I(0) e^{r_0 t}$) and along a classical network with degree distribution $P_k$ (red; the dashed line corresponds to the early
exponential approximation $I(0) e^{r_0 t}$). Note that the epidemic is self-maintained on the classical network ($r_0 > 0$) while
it is not on the weighted network ($r_0 < 0$). In the latter case, only a minor epidemic spreads around randomly
infected hosts as $r_0^{MF}$ exceeds zero for these initial nodes (grey line, dashed, mean field approximation $r_0^{MF}$).

6 Discussion



Network theory and specifically network epidemic models have broadened our understanding of the spread
of infectious agents - or other entities such as information, money, travellers or goods - in complex settings.
In their simplest form, they do not consider that contacts may show variability in their transmission capacity.
But the probability of disease transmission along a contact strongly depends on the intensity of the contact,
transportation links vary in their throughput and information may not be shared equally among all possible
channels. Although earlier studies have shown that this weighting of contacts has a non-negligible impact on
the spreading dynamics [14], modelling of epidemics on weighted networks largely focuses on simulation
studies [17, 33, 14], mean field approximations [34, 35] or discrete time dynamics [21, 22], providing explicit
expressions for epidemic characteristics such as the basic reproductive ratio $R_0$ only in special cases.

We extend these findings by providing a framework based on partial differential equations that allows us to
model continuous time SIR epidemic dynamics for very general weighted networks defined through the joint
probability distribution for an individual to have $k$ contacts and $l$ interaction events. From this we are able to
derive the full epidemic dynamics in terms of the number of susceptible, infected and recovered individuals
over time as well as explicit expressions for the basic reproductive ratio $R_0$ and the exponent of early epidemic
expansion $r_0$. The results of our method for epidemics on artificial and empirically motivated networks
correspond well to simulation results on these networks. Moreover, the method reveals the impact of assortative effects
introduced by contact weighting on epidemic dynamics, which will need closer attention in future research.



7 Materials and Methods 

7.1 Analytical results and their validation 

The set of differential equations describing the epidemic process is derived by careful bookkeeping of the links
along which an infectious agent spreads; a detailed derivation is provided in the Supporting Information.
The analytical approximation assumes that an individual distributes his/her $l$ interaction events multinomially
among his/her $k$ contacts and is infected at a rate proportional to his/her average number of interaction events
with infected contacts. This averaging implies the choice of a time scale on which $\langle l \rangle > \langle k \rangle$. As the analytical
approach is node-centric, it does not, however, consider the constraint that half-contacts must match half-contacts
with similar weight, which leads to an unrealistic network segregation in some artificial networks for $\langle l \rangle > \langle k \rangle$. This
network segregation changes epidemic dynamics in these networks, which is not seen in the analytical approach
per se but can be considered through corrections in the analytical approach.

To validate the analytical approach, we look at combinations of Poisson and power law distributions for the
number of contacts $k$ and interaction events per time $l$. As we neglect isolated hosts, we look at the Poisson distribution
$p_n = (1 - \delta_{n0}) \frac{\langle n \rangle^n}{n!} e^{-\langle n \rangle}$ with support for $n > 0$ and probability generating function $g(x) = e^{\langle n \rangle (x - 1)}$ and
power law distributions with exponent $\lambda$ and cut-off $\kappa$, $p_n = \frac{n^{-\lambda} e^{-n/\kappa}}{\mathrm{Li}_\lambda(e^{-1/\kappa})}$ and $g(x) = \frac{\mathrm{Li}_\lambda(x e^{-1/\kappa})}{\mathrm{Li}_\lambda(e^{-1/\kappa})}$ (normalisation through
the polylogarithm $\mathrm{Li}_\lambda$), for homogeneous and heterogeneous behaviour, respectively. If the joint probability
distribution $P_{kl}$ is given by the product $P_{kl} = P_k P_l$ of independent distributions with PGFs $g_1(x)$ and $g_2(y)$,
their joint PGF $G(x, y)$ is also given by the product $G(x, y) = g_1(x) g_2(y)$. In the linear case with $P_{kl} = P_k \delta_{kl}$,
the PGF $G(x, y)$ is given by $G(x, y) = g_1(xy)$.
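To see how these two forms of the joint PGF feed into the mixed moment $\langle kl \rangle = G^{(1,1)}(1, 1)$, the following short derivation may help. It is a sketch that uses only the standard PGF identities $g'(1) = \langle n \rangle$ and $g''(1) = \langle n(n-1) \rangle$.

```latex
\begin{align}
\text{independent:}\quad G(x,y) &= g_1(x)\, g_2(y)
  &\Rightarrow\quad G^{(1,1)}(1,1) &= g_1'(1)\, g_2'(1) = \langle k \rangle \langle l \rangle, \\
\text{linear:}\quad G(x,y) &= g_1(xy)
  &\Rightarrow\quad G^{(1,1)}(x,y) &= g_1'(xy) + xy\, g_1''(xy), \notag \\
  & & G^{(1,1)}(1,1) &= \langle k \rangle + \langle k(k-1) \rangle = \langle k^2 \rangle.
\end{align}
```

The linear case therefore recovers the familiar dependence on the second moment of the degree distribution, while independent weights replace it by the product of the two means.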
7.2 Derivation of $r_0$ and $R_0$



The derivation of $r_0$ is based on the observation that the rate of epidemic expansion as described by equations
(3) is proportional to the number of links/contacts between susceptible and infected individuals $M_{SI}$, i.e.
$p_{SI} S \langle l \rangle_S = \frac{M_{SI}}{M_S} S \langle l \rangle_S = M_{SI} \frac{\langle l \rangle_S}{\langle k \rangle_S} \approx M_{SI} \frac{\langle l \rangle}{\langle k \rangle}$, with $M_S = S \, G_S^{(1,0)}(1, 1, t) = S \langle k \rangle_S$ being the number of
contacts of susceptible individuals.

In the mean field approximation, $M_{SI}$ is approximated by the product of the number of infected individuals ($I$) and the
average number of contacts per individual found originally in the total population ($\langle k \rangle = G^{(1,0)}(1, 1, 0)$). As
soon as the epidemic is set, $M_{SI}$ is evaluated from the product of $I$ and a slightly more sophisticated estimate
of the number of contacts of infected hosts: each infected node contributes to $M_{SI}$ by the average excess degree
of a recently infected node, $\frac{G^{(1,1)}(1,1,0)}{G^{(0,1)}(1,1,0)} - 1 = \frac{\langle kl \rangle}{\langle l \rangle} - 1$ (for details, see the Supporting Information). This means
discounting the contact the infection has spread from and assuming that all 'new' contacts are still susceptible
in the early phase of an epidemic. Considering that the contact along which the epidemic spreads further also
needs to be discounted results in $M_{SI} \approx I \left( \frac{G^{(1,1)}(1,1,0)}{G^{(0,1)}(1,1,0)} - 2 \right) = I \left( \frac{\langle kl \rangle}{\langle l \rangle} - 2 \right)$.
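All together we have, 

$$\dot{I} = \beta p_{SI} \langle l \rangle_S S - \gamma I \approx \left( \beta \langle l \rangle - \gamma \right) I \tag{5}$$

$$\dot{I} \approx \left( \beta \frac{\langle kl \rangle - 2 \langle l \rangle}{\langle k \rangle} - \gamma \right) I \tag{6}$$

where the bracketed factor is the early growth rate $r_0$ in each case. 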




Approximation (5) corresponds to a "mean field approximation" representing the neighbourhood of a randomly 
picked node, i.e. not a node picked according to its number of interaction events per time interval. 
Approximation (6) considers that an infected individual has been picked with a probability proportional to its 
number of interaction events per time interval. The doubling time $t_D$ can be derived from $r_0$ as $t_D = \ln(2)/r_0$. 
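
As an illustration of how the two estimates of $M_{SI}$ translate into growth rates, the sketch below inserts them into $\dot{I} = \beta M_{SI} \langle l \rangle / \langle k \rangle - \gamma I$, assuming that early in the epidemic the averages among susceptible hosts equal the population averages. The closed forms in the comments follow from that substitution and are an assumption of this sketch; the function names, the toy distribution and the parameter values are hypothetical.

```python
import math

def moments(P):
    """Return <k>, <l>, <kl> for a joint pmf stored as {(k, l): P_kl}."""
    mean_k = sum(p * k for (k, l), p in P.items())
    mean_l = sum(p * l for (k, l), p in P.items())
    mean_kl = sum(p * k * l for (k, l), p in P.items())
    return mean_k, mean_l, mean_kl

def r0_mean_field(P, beta, gamma):
    # M_SI ~ I <k>  gives  dI/dt ~ (beta <l> - gamma) I
    _, mean_l, _ = moments(P)
    return beta * mean_l - gamma

def r0_weighted(P, beta, gamma):
    # M_SI ~ I (<kl>/<l> - 2)  gives  dI/dt ~ (beta (<kl> - 2<l>)/<k> - gamma) I
    mean_k, mean_l, mean_kl = moments(P)
    return beta * (mean_kl - 2 * mean_l) / mean_k - gamma

def doubling_time(r0):
    """t_D = ln(2) / r_0, only meaningful for a growing epidemic (r_0 > 0)."""
    return math.log(2) / r0

# Hypothetical toy distribution: half the hosts have one contact, a fifth have ten.
P_toy = {(1, 2): 0.5, (4, 8): 0.3, (10, 20): 0.2}
beta, gamma = 0.05, 0.1
r_mf = r0_mean_field(P_toy, beta, gamma)
r_w = r0_weighted(P_toy, beta, gamma)
t_d = doubling_time(r_w)
```

For this toy distribution the weighted estimate exceeds the mean field estimate, because hosts with many interaction events also have many contacts.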

The basic reproductive ratio $R_0$ is the average number of secondary infections of a typical infected host in 
a fully susceptible population. As for SIR models on classical random networks, it is derived by first evaluating 
the distribution of excess contacts of a typically infected host, i.e. the probability for a node chosen proportional 
to its number of interaction events per time ($l$) to have $k$ excess contacts. This probability is 
$Q_{kl} = \frac{l P_{k+1,l}}{\langle l \rangle}$. 

$R_0$ is evaluated in the Supporting Information as the number of infections that spread along these excess 
contacts before recovery of the typically infected host. 



7.3 Simulations 

Networks for simulation were generated by first generating 10000 nodes with $k$ half-contacts (stubs) and $l$ 
interaction events as drawn from the probability distribution $P_{kl}$. Each node's $l$ interaction events are then 
distributed multinomially among its $k$ stubs. Stubs are randomly matched, with matches being rejected if the 
stubs' weights differ by more than one interaction event or more than 10%. If stubs with non-identical 
weights are matched, the contact is assigned the mean weight randomly rounded to the next integer. This 
results in the empirical distribution $P_{kl}$. 
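
The construction steps can be sketched as follows. This is an illustrative reading of the description, not the code used for the simulations: the acceptance rule is interpreted as "accept if the weights differ by at most one event, or by at most 10% of the larger weight", and the rejection of self-loops and duplicate contacts as well as the `max_tries` cap are assumptions added here. All function names are hypothetical.

```python
import random
from collections import Counter

def assign_stub_weights(k, l, rng):
    """Distribute l interaction events multinomially (equal probabilities) over k stubs."""
    counts = Counter(rng.randrange(k) for _ in range(l))
    return [counts.get(s, 0) for s in range(k)]

def weights_compatible(w1, w2):
    # Assumed reading of the rule: accept if the weights differ by at most one
    # interaction event, or by at most 10% of the larger weight.
    diff = abs(w1 - w2)
    return diff <= 1 or diff <= 0.1 * max(w1, w2)

def build_network(nodes, rng, max_tries=100_000):
    """nodes: list of (k, l) pairs. Returns {(u, v): weight} with u < v."""
    stubs = [(u, w) for u, (k, l) in enumerate(nodes)
             for w in assign_stub_weights(k, l, rng)]
    edges = {}
    tries = 0
    while len(stubs) > 1 and tries < max_tries:
        tries += 1
        a, b = rng.sample(range(len(stubs)), 2)
        (u, wu), (v, wv) = stubs[a], stubs[b]
        key = (min(u, v), max(u, v))
        # Assumption of this sketch: self-loops and duplicate contacts are rejected too.
        if u == v or key in edges or not weights_compatible(wu, wv):
            continue
        mean = (wu + wv) / 2
        # Random rounding: a mean of x.5 goes up or down with equal probability.
        edges[key] = int(mean) + (rng.random() < mean - int(mean))
        for idx in sorted((a, b), reverse=True):
            stubs.pop(idx)  # remove the two matched stubs
    return edges

rng = random.Random(42)
toy_nodes = [(rng.randint(1, 4), rng.randint(1, 12)) for _ in range(200)]  # hypothetical draw
toy_edges = build_network(toy_nodes, rng, max_tries=20_000)
```

Stubs left unmatched when the try limit is reached are simply dropped, so realised degrees can fall below the drawn $k$. For networks of 10000 nodes a swap-and-pop stub list would be faster than `list.pop`.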

We use the Gillespie direct algorithm [36] to run stochastic SIR epidemics in continuous time. For each 
susceptible node $i$, transmission occurs at rate $\beta \sum_j W_{ij}$, where $\beta$ is the probability of transmission per sex 
act, $W$ is the weighted adjacency matrix listing the number of interaction events per time step between all 
pairs of nodes, and $j$ runs over the infected nodes connected to node $i$. Infected nodes recover at rate $\gamma$. For 
each case analyzed, 20 nodes were initially infected uniformly at random in a population of 10000 in replicate 
simulations and networks. 
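
A minimal sketch of the Gillespie direct method for this process is given below, assuming the network is stored as a symmetric dictionary of neighbour weights. It illustrates the event rates described above and is not the simulation code itself; the function name, the data layout and the toy network are hypothetical.

```python
import random

def gillespie_sir(adj, beta, gamma, seeds, rng):
    """adj: {node: {neighbour: weight}}, symmetric. Returns a list of (t, S, I, R)."""
    state = {u: "S" for u in adj}
    pressure = {u: 0 for u in adj}  # sum_j W_ij over infected neighbours j
    infected = []

    def infect(u):
        state[u] = "I"
        infected.append(u)
        for v, w in adj[u].items():
            pressure[v] += w

    def recover(u):
        state[u] = "R"
        infected.remove(u)
        for v, w in adj[u].items():
            pressure[v] -= w

    for u in seeds:  # seeds are assumed to be distinct nodes
        infect(u)
    t, n_rec = 0.0, 0
    history = [(t, len(adj) - len(infected), len(infected), n_rec)]
    while infected:
        at_risk = [u for u in adj if state[u] == "S" and pressure[u] > 0]
        infection_rate = beta * sum(pressure[u] for u in at_risk)
        total_rate = infection_rate + gamma * len(infected)
        if total_rate <= 0:
            break  # nothing can happen any more (e.g. gamma = 0 and nobody at risk)
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if rng.random() * total_rate < infection_rate:
            # infection: pick a susceptible node proportional to its pressure
            infect(rng.choices(at_risk, weights=[pressure[u] for u in at_risk])[0])
        else:
            # recovery: all infected nodes recover at the same rate gamma
            recover(rng.choice(infected))
            n_rec += 1
        history.append((t, len(adj) - len(infected) - n_rec, len(infected), n_rec))
    return history

# Hypothetical four-node path with contact weights 2, 1 and 3
toy_adj = {0: {1: 2}, 1: {0: 2, 2: 1}, 2: {1: 1, 3: 3}, 3: {2: 3}}
run = gillespie_sir(toy_adj, beta=0.5, gamma=1.0, seeds=[0], rng=random.Random(0))
```

The list of nodes at risk is rebuilt at every event for clarity; for 10000 nodes one would maintain the rates incrementally.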
A Supporting Information 



A.1 Equations for the epidemic model on weighted networks 

We consider an epidemic of a disease that is transmitted at a rate $\beta$, and from which infected individuals 
recover at a rate $\gamma$. Susceptible individuals with $k$ contacts and $l$ interaction events per time interval are 
infected at a rate proportional to $\beta$, $l$ and the probability $p_{SI}$ that a susceptible individual's contact is made 
with an infected individual. This leads to the following equations for the evolution of the number of susceptible, 
infected and recovered individuals with $k$ (infectious) contacts and $l$ interaction events per time: 

$$\dot{S}_{kl} = -\beta p_{SI}\, l\, S_{kl} \tag{7}$$
$$\dot{I}_{kl} = +\beta p_{SI}\, l\, S_{kl} - \gamma I_{kl} \tag{8}$$
$$\dot{R}_{kl} = \gamma I_{kl} \tag{9}$$

A detailed overview of the model's notation and parameters is given in Table 1. 

Adding up the contributions for all $k$ and $l$ introduces the average number of interaction events per time 
and susceptible individual, $\langle l \rangle_S = \sum_{k,l} l P_{Skl} = \sum_{k,l} l \frac{S_{kl}}{S}$, into the equations. This average number can also 
be expressed in terms of the probability generating function $G_S(x, y, t) = \sum_{k,l} P_{Skl}(t)\, x^k y^l$ of the joint 
probability distribution $P_{Skl}$ to find $k$ contacts and $l$ interaction events per time among susceptible individuals: 
$\langle l \rangle_S = \sum_{k,l} l P_{Skl} = G_S^{(0,1)}(1,1,t)$. The $(0,1)$ exponent of $G_S$ indicates the orders of the partial derivatives 
with respect to the first and second argument of $G_S$ (see Table 1). 

Summation of $S_{kl}$ and $I_{kl}$ over $k$ and $l$ results in equations for the total number of susceptible and infected 
hosts: 

$$\dot{S} = -\beta p_{SI} S\, G_S^{(0,1)}(1,1,t) \tag{10}$$
$$\dot{I} = \beta p_{SI} S\, G_S^{(0,1)}(1,1,t) - \gamma I \tag{11}$$
$$\dot{R} = \gamma I \tag{12}$$

To close this set of equations we also need to derive equations for $p_{SI}$, as well as for the probability 
generating function (PGF) $G_S(x, y, t)$. 

We begin by deriving the temporal dynamics of the probabilities for a link starting from a susceptible 
individual to point to a susceptible or infected individual, $p_{SS}$ and $p_{SI}$, respectively. Following the argument 
in [23] we write $p_{SS} = M_{SS}/M_S$ and $p_{SI} = M_{SI}/M_S$ to express these probabilities in terms of the number 
of links/contacts among susceptible and infected hosts ($M_{SS}$, $M_{SI}$) and the total number of links/contacts of 
susceptible hosts ($M_S$). From this, we get: 

$$\dot{p}_{SS} = \frac{\dot{M}_{SS}}{M_S} - \frac{\dot{M}_S}{M_S}\, p_{SS} \tag{13}$$
$$\dot{p}_{SI} = \frac{\dot{M}_{SI}}{M_S} - \frac{\dot{M}_S}{M_S}\, p_{SI} \tag{14}$$

for which expressions are derived in the following paragraphs. From the definition of $M_S$, we can derive the 
following equation: 

$$\dot{M}_S = \sum_{k,l} k\, \dot{S}_{kl} \tag{15}$$

After substitution from equation (7) this results in 

$$\dot{M}_S = -\beta p_{SI} S\, G_S^{(1,1)}(1,1,t) \tag{16}$$



We then follow the arguments made in an earlier study [23], which rely on the assumption that the number 
of contacts from susceptible hosts to susceptible, infected and recovered hosts is multinomially distributed with 
probabilities $p_{SS}$, $p_{SI}$ and $p_{SR} = 1 - p_{SS} - p_{SI}$. We also assume that the same applies to the number of 
interaction events/sex acts per time interval. If a node with $k$ contacts has $j$ contacts with susceptible individuals 
and $i$ contacts with infected individuals, its interaction events with susceptible, infected and recovered individuals, 
$n_{SS}$, $n_{SI}$ and $n_{SR}$, respectively, are distributed according to 

$$\frac{l!}{n_{SS}!\, n_{SI}!\, n_{SR}!} \left( \frac{j}{k} \right)^{n_{SS}} \left( \frac{i}{k} \right)^{n_{SI}} \left( \frac{k-i-j}{k} \right)^{n_{SR}} \tag{17}$$

with averages $\langle n_{SS} \rangle = \frac{jl}{k}$, $\langle n_{SI} \rangle = \frac{il}{k}$ and $\langle n_{SR} \rangle = \frac{(k-i-j)\,l}{k}$. Note that $l = n_{SS} + n_{SI} + n_{SR}$. 

The probability that a susceptible node with $j$, $i$ and $k-i-j$ contacts to susceptible, infected and recovered 
individuals, respectively, is reached from an infected node, i.e. chosen with a probability proportional to the 
average number of sex acts with infected nodes ($\langle n_{SI} \rangle = \frac{il}{k}$), is then 

$$\frac{P_{kl}\, \frac{k!}{j!\, i!\, (k-i-j)!}\, p_{SS}^{\,j}\, p_{SI}^{\,i}\, p_{SR}^{\,k-j-i}\, \langle n_{SI} \rangle}{p_{SI}\, G^{(0,1)}(1,1)} \tag{18}$$
$$= \frac{l P_{kl}\, \frac{(k-1)!}{j!\, (i-1)!\, (k-1-(i-1)-j)!}\, p_{SS}^{\,j}\, p_{SI}^{\,i-1}\, p_{SR}^{\,k-j-i}}{G^{(0,1)}(1,1)} \tag{19}$$


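Note that the denominator can be derived using the observation that 

$$\sum_{k,l} l P_{kl} \sum_{i,j \le k} \frac{(k-1)!}{j!\, (i-1)!\, (k-1-(i-1)-j)!}\, p_{SS}^{\,j}\, p_{SI}^{\,i}\, p_{SR}^{\,k-j-i} \tag{20}$$
$$= p_{SI} \sum_{k,l} l P_{kl}\, (p_{SS} + p_{SI} + p_{SR})^{k-1} \tag{21}$$
$$= p_{SI} \sum_{k,l} l P_{kl} = p_{SI}\, G^{(0,1)}(1,1) \tag{22}$$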



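Therefore, the probability distribution for the contacts and potential interaction events of a node which was 
chosen proportional to $\langle n_{SI} \rangle = \frac{il}{k}$ is generated by 

$$\frac{x_I \sum_{k,l} l P_{kl}\, y^l \sum_{i,j \le k} \frac{(k-1)!}{j!\, (i-1)!\, (k-i-j)!}\, (x_S p_{SS})^j\, (x_I p_{SI})^{i-1}\, (x_R p_{SR})^{k-j-i}}{G^{(0,1)}(1,1)} \tag{23}$$
$$= \frac{x_I \sum_{k,l} l P_{kl}\, y^l\, (x_S p_{SS} + x_I p_{SI} + x_R p_{SR})^{k-1}}{G^{(0,1)}(1,1)} \tag{24}$$
$$= \frac{x_I\, y}{x_S p_{SS} + x_I p_{SI} + x_R p_{SR}}\; \frac{G^{(0,1)}(x_S p_{SS} + x_I p_{SI} + x_R p_{SR},\, y)}{G^{(0,1)}(1,1)} \tag{25}$$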



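Choosing a node proportional to its average number of interaction events per time ($\langle n_{SI} \rangle = il/k$) instead of 
the actual number of interaction events ($n_{SI}$) implies the assumption of a time scale at which $l \gg k$, i.e. a case 
in which fluctuations around $\langle n_{SI} \rangle$ can be expected to be small. Taking all this together, the average degrees of 
a susceptible node that was reached from an infected node to susceptible or infected nodes are 

$$\delta_{SI}(S) = \left. \frac{\partial}{\partial x_S} \left[ \frac{x_I\, y}{x_S p_{SS} + x_I p_{SI} + x_R p_{SR}}\; \frac{G_S^{(0,1)}(x_S p_{SS} + x_I p_{SI} + x_R p_{SR},\, y,\, t)}{G_S^{(0,1)}(1,1,t)} \right] \right|_{x_S = x_I = x_R = y = 1} \tag{26}$$
$$= p_{SS} \left( \frac{G_S^{(1,1)}(1,1,t)}{G_S^{(0,1)}(1,1,t)} - 1 \right) \tag{27}$$
$$\delta_{SI}(I) = \left. \frac{\partial}{\partial x_I} \left[ \frac{x_I\, y}{x_S p_{SS} + x_I p_{SI} + x_R p_{SR}}\; \frac{G_S^{(0,1)}(x_S p_{SS} + x_I p_{SI} + x_R p_{SR},\, y,\, t)}{G_S^{(0,1)}(1,1,t)} \right] \right|_{x_S = x_I = x_R = y = 1} \tag{28}$$
$$= p_{SI} \left( \frac{G_S^{(1,1)}(1,1,t)}{G_S^{(0,1)}(1,1,t)} - 1 \right) + 1 \tag{29}$$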



The average number of contacts to susceptible and infected nodes needs to be discounted by one for the number 
of contacts with infected nodes to take the contact to the source of infection into account (directly considering 
this in the PGF gives the same result), i.e. the total excess degree of the afore described type of node is 

$$\frac{G_S^{(1,1)}(1,1,t)}{G_S^{(0,1)}(1,1,t)} - 1.$$



Bookkeeping of the changes in the numbers of links among susceptible and infected hosts due to the 
epidemic process results in: 

Changes due to epidemic spread 

$$\dot{M}_{SI} = \dot{S}\, (p_{SI} - p_{SS}) \left( \frac{G_S^{(1,1)}(1,1,t)}{G_S^{(0,1)}(1,1,t)} - 1 \right) - \beta\, \frac{G_S^{(0,1)}(1,1,t)}{G_S^{(1,0)}(1,1,t)}\, M_{SI} - \gamma M_{SI}$$

first term : change in the number of susceptible nodes $\dot{S} = -\beta p_{SI} S\, G_S^{(0,1)}(1,1,t)$ due to the epidemic 
times their average excess contacts to susceptible and infected nodes 

second term : discount for link along which the infection spread 

third term : link loss due to disease progression 

$$\dot{M}_{SS} = 2\, \dot{S}\, p_{SS} \left( \frac{G_S^{(1,1)}(1,1,t)}{G_S^{(0,1)}(1,1,t)} - 1 \right)$$

: change in the number of susceptible nodes $\dot{S} = -\beta p_{SI} S\, G_S^{(0,1)}(1,1,t)$ due to the epidemic times 
their average excess contacts to susceptible nodes (bi-directional) 



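In summary this results in 

$$\dot{M}_{SI} = \beta p_{SI} S \left( G_S^{(0,1)}(1,1,t) - G_S^{(1,1)}(1,1,t) \right) (p_{SI} - p_{SS}) - \beta\, \frac{G_S^{(0,1)}(1,1,t)}{G_S^{(1,0)}(1,1,t)}\, M_{SI} - \gamma M_{SI} \tag{30}$$
$$\dot{M}_{SS} = -2 \beta\, p_{SS}\, p_{SI}\, S \left( G_S^{(1,1)}(1,1,t) - G_S^{(0,1)}(1,1,t) \right) \tag{31}$$

which, together with equations (13), (14) and (16) and $M_S = S\, G_S^{(1,0)}(1,1,t)$, finally leads to 

$$\dot{p}_{SI} = \beta p_{SI}\, p_{SS}\, \frac{G_S^{(1,1)}(1,1,t) - G_S^{(0,1)}(1,1,t)}{G_S^{(1,0)}(1,1,t)} - \beta p_{SI} (1 - p_{SI})\, \frac{G_S^{(0,1)}(1,1,t)}{G_S^{(1,0)}(1,1,t)} - \gamma p_{SI} \tag{32}$$
$$\dot{p}_{SS} = -\beta p_{SI}\, p_{SS}\, \frac{G_S^{(1,1)}(1,1,t) - 2\, G_S^{(0,1)}(1,1,t)}{G_S^{(1,0)}(1,1,t)} \tag{33}$$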
To close the set of equations we also need to derive an equation for the probability generating function (PGF) 
$G_S(x, y, t)$, which corresponds to the probability to find individuals with $k$ contacts and $l$ interaction events 
(e.g. sex acts) per time interval among susceptible hosts, i.e. $P_{Skl}$. From the definitions of the PGF and 
$P_{Skl} = S_{kl}/S$, we obtain 

$$\dot{G}_S(x, y, t) = \sum_{k,l} \left( \frac{\dot{S}_{kl}}{S} - \frac{S_{kl}}{S}\, \frac{\dot{S}}{S} \right) x^k y^l, \tag{34}$$

which results with equations (7) and (10) in 

$$\dot{G}_S(x, y, t) = \beta p_{SI} \left( G_S^{(0,1)}(1,1,t)\, G_S(x, y, t) - y\, G_S^{(0,1)}(x, y, t) \right). \tag{35}$$

The probability generating functions $G_I(x, y, t)$ and $G_R(x, y, t)$ can be derived analogously (though they are 
not needed to close the system of equations). 
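
To illustrate how the closed system can be integrated, the sketch below tracks the class-wise susceptible fractions $S_{kl}/N$ directly, which carries the same information as $G_S$, and evaluates the moments $S G_S^{(1,0)}$, $S G_S^{(0,1)}$ and $S G_S^{(1,1)}$ at $(1,1,t)$ as plain sums. The $p_{SI}$ and $p_{SS}$ updates use the forms obtained by inserting the link bookkeeping into the quotient-rule expressions for $\dot{p}_{SI}$ and $\dot{p}_{SS}$. The explicit Euler scheme, the initial conditions ($p_{SI}(0) = \epsilon$, $p_{SS}(0) = 1 - \epsilon$ for a fraction $\epsilon$ infected uniformly at random), the function name and the toy distribution are assumptions of this sketch.

```python
def integrate_weighted_sir(P, beta, gamma, eps=1e-3, dt=0.01, t_end=60.0):
    """Explicit Euler integration; P is a joint pmf stored as {(k, l): P_kl}."""
    s = {kl: (1.0 - eps) * p for kl, p in P.items()}  # S_kl / N
    i_tot, r_tot = eps, 0.0
    p_si, p_ss = eps, 1.0 - eps
    trajectory = [(0.0, sum(s.values()), i_tot, r_tot, p_si, p_ss)]
    for step in range(1, int(round(t_end / dt)) + 1):
        # S*G_S^(1,0), S*G_S^(0,1) and S*G_S^(1,1), all evaluated at (1, 1, t)
        sum_k = sum(k * v for (k, l), v in s.items())
        sum_l = sum(l * v for (k, l), v in s.items())
        sum_kl = sum(k * l * v for (k, l), v in s.items())
        new_infections = beta * p_si * sum_l
        d_p_si = (beta * p_si * p_ss * (sum_kl - sum_l) / sum_k
                  - beta * p_si * (1.0 - p_si) * sum_l / sum_k
                  - gamma * p_si)
        d_p_ss = -beta * p_si * p_ss * (sum_kl - 2.0 * sum_l) / sum_k
        d_i = new_infections - gamma * i_tot
        d_r = gamma * i_tot
        # class-wise susceptible dynamics: dS_kl/dt = -beta p_SI l S_kl
        s = {(k, l): v * (1.0 - dt * beta * p_si * l) for (k, l), v in s.items()}
        i_tot += dt * d_i
        r_tot += dt * d_r
        p_si += dt * d_p_si
        p_ss += dt * d_p_ss
        trajectory.append((step * dt, sum(s.values()), i_tot, r_tot, p_si, p_ss))
    return trajectory

# Hypothetical toy distribution and parameters
P_toy = {(1, 2): 0.5, (4, 8): 0.3, (10, 20): 0.2}
traj = integrate_weighted_sir(P_toy, beta=0.05, gamma=0.1)
```

Each row of `traj` holds $(t, S, I, R, p_{SI}, p_{SS})$ as population fractions; a higher-order integrator would be preferable for production use.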



A.2 The basic reproductive ratio $R_0$ 

The basic reproductive ratio $R_0$ of an SIR epidemic with transmission rate $\beta$ and recovery rate $\gamma$ on a 
classical network can be derived as 

$$R_0 = \frac{g^{(2)}(1)}{g^{(1)}(1)} \int_0^\infty \gamma e^{-\gamma t} \left( 1 - e^{-\beta t} \right) dt \tag{36}$$
$$= \frac{g^{(2)}(1)}{g^{(1)}(1)} \left( \frac{\beta}{\beta + \gamma} \right) \tag{37}$$

and is the product of the average excess degree of a node which was reached according to its degree and the 
transmissibility, i.e. the probability that an infection is spread along a link before recovery (here $\beta/(\beta + \gamma)$). 
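These terms do not factorize in the case where there are $l$ interaction events per node defined through the joint 
probability distribution $P_{kl}$. To derive $R_0$ for this case we first derive the excess degree distribution $Q_{kl}$ of a 
node that was reached with probability proportional to $l$ and has $k$ excess contacts and $l$ interaction events per 
time interval, 

$$Q_{kl} = \frac{l\, P_{k+1,l}}{\langle l \rangle} \tag{38}$$

and get for $l$ interaction events multinomially distributed among the $k+1$ contacts (with probabilities 
$p_1 = \dots = p_{k+1} = \frac{1}{k+1}$, $m_1 + \dots + m_{k+1} = l$) 

$$R_0 = \sum_{k,l} Q_{kl} \sum_{m_1,\dots,m_{k+1}} \frac{l!}{m_1! \cdots m_{k+1}!}\, p_1^{m_1} \cdots p_{k+1}^{m_{k+1}} \sum_{j=1}^{k} \int_0^\infty \gamma e^{-\gamma t} \left( 1 - e^{-\beta m_j t} \right) dt \tag{39}$$
$$= \sum_{k,l} \sum_{m_1,\dots,m_{k+1}} \frac{l!}{m_1! \cdots m_{k+1}!}\, p_1^{m_1} \cdots p_{k+1}^{m_{k+1}}\, Q_{kl} \sum_{j=1}^{k} \frac{\beta m_j}{\beta m_j + \gamma} \tag{40}$$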
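The SI model is trivially included in the SIR network models in the limit $\gamma \to 0$, for which $R_0$ can be derived 
from equation (40): 

$$R_0 = \sum_{k,l} \sum_{m_1,\dots,m_{k+1}} \frac{l!}{m_1! \cdots m_{k+1}!}\, p_1^{m_1} \cdots p_{k+1}^{m_{k+1}}\, k\, Q_{kl} \tag{41}$$
$$= \sum_{k,l} k\, Q_{kl} \tag{42}$$
$$= \frac{G^{(1,1)}(1,1) - G^{(0,1)}(1,1)}{G^{(0,1)}(1,1)} \tag{43}$$
$$= \frac{\langle kl \rangle - \langle l \rangle}{\langle l \rangle} = \frac{\langle kl \rangle}{\langle l \rangle} - 1 \tag{44}$$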

As long as infected individuals stay indefinitely infected, $R_0$ is not affected by the transmission rate and it 
measures whether there is a giant connected component in the network. Still, even then the effect of weighting 
of the contact network through the number of interaction events $l$ is noticed. 
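
The SI limit lends itself to a short numerical check. The sketch below assumes the size-biased form $Q_{kl} = l P_{k+1,l} / \langle l \rangle$ for the excess distribution (a node reached proportional to $l$, with one contact discounted) and confirms that $\sum_{k,l} k Q_{kl}$ equals $\langle kl \rangle / \langle l \rangle - 1$; the function names and the toy distribution are hypothetical.

```python
def excess_distribution(P):
    """Q_kl = l P_{k+1,l} / <l> for a joint pmf stored as {(k, l): P_kl}."""
    mean_l = sum(p * l for (k, l), p in P.items())
    # a node with k contacts in P has k - 1 excess contacts
    return {(k - 1, l): l * p / mean_l for (k, l), p in P.items()}

def r0_si_limit(P):
    """R_0 in the limit gamma -> 0: the mean number of excess contacts."""
    Q = excess_distribution(P)
    return sum(k * q for (k, l), q in Q.items())

# Hypothetical toy distribution
P_toy = {(1, 2): 0.5, (4, 8): 0.3, (10, 20): 0.2}
mean_l = sum(p * l for (k, l), p in P_toy.items())
mean_kl = sum(p * k * l for (k, l), p in P_toy.items())
r0_direct = r0_si_limit(P_toy)
r0_moments = mean_kl / mean_l - 1.0
```

Both routes give the same value, and it exceeds the unweighted mean excess degree $\langle k^2 \rangle / \langle k \rangle - 1$ only through the correlation between $k$ and $l$.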

The basic reproductive ratio $R_0$ for an SIR model on a weighted network can be approximated by 

71 r 00 -— 

k~l mi ,~m k+1 mi ' • • - m fc+l + 70 ^ 

- E E (46) 

fc,i mi,...,m fe+1 x ""^ " u v ' ' 

= E^^^r^a-e-^dt (47) 



(i)/3 + 7 G(o.i)(!,i) 



(48) 



Note that for the linear case with $P_{kl} = P_k \delta_{kl}$ we obtain $G(x, y) = \sum_{k,l} x^k y^l P_k \delta_{kl} = \sum_k (xy)^k P_k = g(xy)$ 
and $R_0 = \frac{\beta}{\beta + \gamma} \frac{g^{(2)}(1)}{g^{(1)}(1)}$, which is consistent with earlier findings. 



A.3 The recovery of the classical equations in the linear case $P_{kl} = P_k \delta_{kl}$ 



The set of equations for the weighted networks (equations 10-12, 32, 33 and 35) includes the case of classical 
network epidemic models, i.e. the linear case where $k = l$ or $P_{kl} = P_k \delta_{kl}$. Focussing on the degree 
distribution among susceptible hosts $P_{Sk}$ with probability generating function $g_S(x)$, the PGF of $P_{Skl}$ is given by 
$G_S(x, y) = g_S(xy)$. Substitution of $G_S(x, y)$ by $g_S(xy)$ results in 

$$G_S^{(1,0)}(1,1,t) = g_S'(1,t) \tag{49}$$
$$G_S^{(0,1)}(1,1,t) = g_S'(1,t) \tag{50}$$
$$G_S^{(1,1)}(1,1,t) = g_S''(1,t) + g_S'(1,t) \tag{51}$$

and for the time evolution of $G_S(x, y)$: 

$$\dot{g}_S(xy, t) = \beta p_{SI} \left( g_S'(1,t)\, g_S(xy, t) - xy\, g_S'(xy, t) \right). \tag{52}$$

Together, this leads to the set of equations for SIR dynamics on a classical configuration type network defined 
by the degree distribution $P_k$ [23, 24]: 

$$\dot{S} = -\beta p_{SI} S\, g_S'(1,t) \tag{53}$$
$$\dot{I} = \beta p_{SI} S\, g_S'(1,t) - \gamma I \tag{54}$$
$$\dot{R} = \gamma I \tag{55}$$
$$\dot{p}_{SI} = \beta p_{SI}\, p_{SS}\, \frac{g_S''(1,t)}{g_S'(1,t)} - \beta p_{SI} (1 - p_{SI}) - \gamma p_{SI} \tag{56}$$
$$\dot{p}_{SS} = -\beta p_{SI}\, p_{SS} \left( \frac{g_S''(1,t)}{g_S'(1,t)} - 1 \right) \tag{57}$$
$$\dot{g}_S(x, t) = \beta p_{SI} \left( g_S'(1,t)\, g_S(x, t) - x\, g_S'(x, t) \right). \tag{58}$$



A.4 Network segregation and the limiting case $P_{kl} = P_k \delta_{\langle l \rangle l}$ (constant case) 

The analytical approximation assumes that an individual distributes his/her interaction events $l$ multinomially 
among his/her $k$ contacts and is infected at a rate proportional to his/her average number of potential 
transmission events with $i$ infected contacts. This averaging implies the choice of a time scale at which 
$\langle l \rangle \gg \langle k \rangle$. This leads to an unrealistic network segregation in some artificial networks, specifically for 
$\langle l \rangle \gg \langle k \rangle$, because the weights of an individual's contacts level off at about $l/k$ and contacts are only made 
between half-contacts of (broadly) matching weight. This network segregation changes the epidemic dynamics. 
As the analytical approach is node-centric, it does not consider the constraint that half-contacts must match 
half-contacts of similar weight. In consequence, the change in epidemic dynamics due to network segregation 
cannot be seen in the analytical approach. 

The effect is particularly pronounced in networks with a heterogeneous degree distribution corresponding to 
many individuals with one contact in combination with a constant number of interaction events per individual. 
Degree one nodes have only one contact to assign their interaction events to, which leaves their contacts on 
average already with twice the weight seen in individuals with two contacts. This weight separation leads to a 
situation that almost only allows contacts among individuals with a single contact, i.e. monogamous couples 
(contact assortativity). Therefore, individuals with one contact can only be infected if their partner is initially 
infected but not later on through the epidemic process, because they are not connected to the giant component 
of the network. In the case of a constant number of interaction events per individual $\langle l \rangle$, the analytical approach 
decouples into independent equations for all $k$ classes with $\langle l \rangle_S = \langle l \rangle$, in which epidemic prevalence grows at 
the same rates: 

$$\dot{S}_{k \langle l \rangle} = -\beta p_{SI} \langle l \rangle\, S_{k \langle l \rangle} \tag{59}$$
$$\dot{I}_{k \langle l \rangle} = +\beta p_{SI} \langle l \rangle\, S_{k \langle l \rangle} - \gamma I_{k \langle l \rangle} \tag{60}$$
$$\dot{R}_{k \langle l \rangle} = \gamma I_{k \langle l \rangle}. \tag{61}$$

Due to the network segregation, epidemic prevalence is reduced in these networks at least by a factor 
proportional to the fraction of nodes with a single contact, as compared to the standard prediction of the 
analytical approach, because these nodes do not participate in the epidemic process. 